In this blog post, I investigate the effect of atmospheric ozone concentrations on energy usage in the United States. The analysis uses data collected during the last six months of 2019. The datasets were obtained from the EPA and the EIA.
Before we load any data, it is important to understand what ozone is and what it does. Ozone in the Earth’s atmosphere reduces the amount of harmful UV radiation that humans are exposed to; that radiation can lead to sunburns and skin cancer (CDC, 2019). While this is only one of the risks that UV radiation presents to humans, it is a commonly known one, and it gives individuals a reason to potentially increase energy usage. As the sun rises to its highest point around noon, humans are exposed to more UV radiation. As a result, water intake may increase and building temperatures may be lowered with air conditioners, both of which would lead to increased energy usage. While the specific applications that lead to increased energy usage will not be investigated, this analysis will check whether there is any correlation between ozone concentrations and energy usage within the United States.
Before analysis can begin, the datasets must first be loaded into RStudio.
power_balance_data_url = 'https://www.eia.gov/realtime_grid/sixMonthFiles/EIA930_BALANCE_2019_Jul_Dec.csv'
power_balance_data = read.csv(url(power_balance_data_url))
# Obtained via the EPA site, specifying 'Aggregate Concentration Data' for the
# last 6 months of 2019 across all sites that they record data from. This data
# is provided as a file rather than a URL due to the inability to directly
# download this dataset.
ozone_daily_max_data = read.csv('./ozone-8-hour-daily-max.csv')
# Obtain date in identical format to power_balance_data.
ozone_daily_max_data$Date <- substr(ozone_daily_max_data$DDATE,1,10)
summary(power_balance_data)
## Balancing.Authority Data.Date Hour.Number
## AEC : 4417 11/03/2019: 1613 Min. : 1.0
## AECI : 4417 07/01/2019: 1560 1st Qu.: 7.0
## AVA : 4417 07/02/2019: 1560 Median :13.0
## AVRN : 4417 07/03/2019: 1560 Mean :12.5
## BANC : 4417 07/04/2019: 1560 3rd Qu.:19.0
## BPAT : 4417 07/05/2019: 1560 Max. :25.0
## (Other):260591 (Other) :277680
## Local.Time.at.End.of.Hour UTC.Time.at.End.of.Hour
## 11/03/2019 1:00:00 AM : 118 01/01/2020 1:00:00 AM : 65
## 01/01/2020 12:00:00 AM: 65 01/01/2020 12:00:00 AM: 65
## 07/01/2019 1:00:00 AM : 65 01/01/2020 2:00:00 AM : 65
## 07/01/2019 1:00:00 PM : 65 01/01/2020 3:00:00 AM : 65
## 07/01/2019 10:00:00 AM: 65 01/01/2020 4:00:00 AM : 65
## 07/01/2019 10:00:00 PM: 65 01/01/2020 5:00:00 AM : 65
## (Other) :286650 (Other) :286703
## Demand.Forecast..MW. Demand..MW. Net.Generation..MW.
## : 49109 : 46324 0 : 13569
## 71 : 307 29,849 : 504 -1 : 3709
## 73 : 294 67 : 307 : 2667
## 88 : 292 74 : 305 1 : 907
## 70 : 286 78 : 305 30,088 : 505
## 72 : 279 71 : 295 60 : 501
## (Other):236526 (Other):239053 (Other):265235
## Total.Interchange..MW. Sum.Valid.DIBAs...MW. Demand..MW...Imputed.
## -1 : 5102 : 5549 :282417
## 0 : 3245 -1 : 5089 0 : 3170
## : 2703 0 : 3212 43 : 13
## 1 : 1339 1 : 1315 44 : 12
## -2 : 1193 -2 : 1190 38 : 11
## 233 : 660 -3 : 649 42 : 11
## (Other):272851 (Other):270089 (Other): 1459
## Net.Generation..MW...Imputed. Demand..MW...Adjusted.
## :285033 : 45067
## 0 : 334 0 : 3389
## 159 : 14 29,849 : 504
## 160 : 10 74 : 310
## 136 : 9 67 : 309
## 181 : 8 78 : 306
## (Other): 1685 (Other):237208
## Net.Generation..MW...Adjusted. Net.Generation..MW..from.Coal
## 0 : 13748 :125292
## -1 : 3709 0 : 14047
## 1 : 914 53 : 913
## : 903 52 : 756
## 60 : 507 54 : 613
## 30,088 : 505 15 : 604
## (Other):266807 (Other):144868
## Net.Generation..MW..from.Natural.Gas Net.Generation..MW..from.Nuclear
## : 57753 :208495
## 0 : 16769 0 : 4774
## -1 : 4253 131 : 645
## 9 : 598 132 : 574
## 60 : 588 2,038 : 569
## 5,308 : 510 130 : 522
## (Other):206622 (Other): 71514
## Net.Generation..MW..from.All.Petroleum.Products
## :190817
## 0 : 69238
## -1 : 2411
## 59 : 1117
## 88 : 952
## 60 : 875
## (Other): 21683
## Net.Generation..MW..from.Hydropower.and.Pumped.Storage
## : 87116
## 0 : 20346
## 5 : 2138
## 1 : 2082
## 4 : 1779
## 14 : 1747
## (Other):171885
## Net.Generation..MW..from.Solar Net.Generation..MW..from.Wind
## :114396 :160074
## 0 : 75982 0 : 24798
## 1 : 5536 1 : 2292
## 2 : 3737 3 : 1586
## 4 : 3298 2 : 1491
## 3 : 3204 4 : 1057
## (Other): 80940 (Other): 95795
## Net.Generation..MW..from.Other.Fuel.Sources
## :135571
## 0 : 15026
## 11 : 3375
## 4 : 3141
## 17 : 2313
## 10 : 2271
## (Other):125396
## Net.Generation..MW..from.Unknown.Fuel.Sources
## Mode:logical
## NA's:287093
##
##
##
##
##
summary(ozone_daily_max_data)
## SITE_ID YEAR DDATE OZONE_8HR_DAILY_MAX
## BAS601 : 183 Min. :2019 08/15/2019 00:00:00: 83 Min. : 0.00
## SND152 : 183 1st Qu.:2019 08/24/2019 00:00:00: 83 1st Qu.:31.00
## CNT169 : 182 Median :2019 08/25/2019 00:00:00: 83 Median :39.00
## GLR468 : 182 Mean :2019 09/28/2019 00:00:00: 83 Mean :39.06
## GRB411 : 182 3rd Qu.:2019 09/29/2019 00:00:00: 83 3rd Qu.:46.00
## NPT006 : 182 Max. :2019 08/14/2019 00:00:00: 82 Max. :85.00
## (Other):13343 (Other) :13940
## RANK QA_CODE UPDATE_DATE Date
## Min. : 1.0 Min. :1.000 10/30/2019 14:45:06: 792 Length:14437
## 1st Qu.:111.0 1st Qu.:1.000 12/02/2019 10:55:20: 792 Class :character
## Median :200.0 Median :3.000 09/20/2019 13:01:01: 737 Mode :character
## Mean :191.8 Mean :2.236 09/20/2019 12:52:48: 87
## 3rd Qu.:276.0 3rd Qu.:3.000 10/30/2019 14:44:41: 62
## Max. :362.0 Max. :3.000 12/02/2019 10:54:57: 62
## (Other) :11905
The datasets appear to be loading as intended. Now we can analyze them.
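One detail in the summary above is worth handling before any arithmetic: values such as 29,849 in Demand..MW. show that the megawatt columns were read as text with thousands separators rather than as numbers. The following is a minimal sketch of one way to convert such a column. The helper name `parse_mw` and the example vector are my own assumptions for illustration and are not part of the original analysis.

```r
# Hypothetical helper (not part of the original post): convert values such as
# "29,849" to numbers by dropping the thousands separator first.
parse_mw <- function(x) {
  # as.character() guards against factor columns, for which as.numeric()
  # would return internal level codes rather than the printed values.
  cleaned <- gsub(",", "", as.character(x), fixed = TRUE)
  # Empty strings (missing readings) come back as NA.
  as.numeric(cleaned)
}

# Illustrative values only, shaped like entries in the Demand..MW. column.
demand_example <- factor(c("29,849", "67", "", "-1"))
parse_mw(demand_example)
```

Applying a conversion like this to each megawatt column before plotting or aggregating would keep the values on their real scale.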
There are over 60 power authorities in the United States. While each of these should ideally be investigated and compared with the others, that would take far too long for the purpose of this exercise. Instead, five major authorities will be used: CISO, PJM, ERCO, FPL, and MISO.
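Selecting those five authorities could look roughly like the sketch below. It uses a small toy data frame in place of `power_balance_data` so that it runs on its own; the toy values and the names `toy_power`, `selected_rows`, and `by_authority` are assumptions for illustration only.

```r
# Balancing authority codes named in the text.
selected_authorities <- c("CISO", "PJM", "ERCO", "FPL", "MISO")

# Toy stand-in for power_balance_data (illustrative values, not real readings).
toy_power <- data.frame(
  Balancing.Authority = c("CISO", "AEC", "PJM", "MISO", "AVA"),
  Data.Date = c("07/01/2019", "07/01/2019", "07/01/2019", "07/02/2019", "07/02/2019"),
  Demand..MW. = c("29,849", "67", "88,120", "74,500", "1,102"),
  stringsAsFactors = FALSE
)

# Keep only the rows that belong to the selected authorities.
selected_rows <- toy_power[toy_power$Balancing.Authority %in% selected_authorities, ]

# Split into one data frame per authority so each can be plotted separately.
by_authority <- split(selected_rows, selected_rows$Balancing.Authority)
names(by_authority)
```

With the real data, the same `%in%` filter would be applied to `power_balance_data` directly.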
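library(ggplot2)

# Generic function to simplify plotting of power demand.
plot_power_demand_df <- function(df) {
  # Demand is read as text with thousands separators (e.g. "29,849"), so strip
  # the commas before converting; as.numeric() on a factor returns level codes.
  df$Demand.Numeric <- as.numeric(gsub(',', '', as.character(df$Demand..MW.)))
  ggplot(data=df,aes(x=Data.Date,y=Demand.Numeric,color=Hour.Number)) +
    geom_point() +
    geom_smooth(method='lm') +
    xlab('Date') +
    ylab('Power Demand (Megawatts)') +
    # Scale date by month to reduce number of labels.
    scale_x_discrete(breaks=c('07/01/2019','08/01/2019','09/01/2019','10/01/2019','11/01/2019','12/01/2019'))
}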
The EPA has over 80 sites across the United States that monitor ozone concentration in the atmosphere. As with power demand, I will look at only five of these sites. They are located in roughly the same regions as the power authorities so that the correlation between power demand and ozone concentration can be assessed as accurately as possible. The sites that will be analyzed are: LAV410, LRL117, ALC188, IRL141, and ALH157.
The site IDs were obtained using the CASTNET Site Locations Map.
# Generic function to simplify plotting of ozone concentration.
plot_ozone_concentration_df <- function(df) {
  # Refer to columns by name inside aes() rather than through df$.
  ggplot(data=df,aes(x=Date,y=as.numeric(OZONE_8HR_DAILY_MAX))) +
    geom_point() +
    xlab('Date') +
    # Values in the 0-85 range indicate parts per billion, not parts per million.
    ylab('Ozone Concentration (PPB)') +
    # Scale date by month to reduce number of labels.
    scale_x_discrete(breaks=c('07/01/2019','08/01/2019','09/01/2019','10/01/2019','11/01/2019','12/01/2019'))
}
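Beyond comparing the plots by eye, the relationship between an authority's demand and a nearby site's ozone readings could be quantified with a correlation coefficient. The sketch below is one possible approach, not the method used for the plots in this post: it averages hourly demand to a daily value, joins it to the daily ozone maximum by date, and computes a Pearson correlation. All values are toy data, and the column name `Demand.MW` is a simplified stand-in that I am assuming for illustration.

```r
# Toy hourly demand for one authority (illustrative values only).
toy_demand <- data.frame(
  Data.Date = rep(c("07/01/2019", "07/02/2019", "07/03/2019", "07/04/2019"), each = 2),
  Demand.MW = c(100, 120, 130, 150, 90, 110, 160, 180)
)

# Toy daily ozone readings for one nearby site (illustrative values only).
toy_ozone <- data.frame(
  Date = c("07/01/2019", "07/02/2019", "07/03/2019", "07/04/2019"),
  OZONE_8HR_DAILY_MAX = c(40, 45, 35, 50)
)

# Collapse hourly demand to one mean value per day so both series are daily.
daily_demand <- aggregate(Demand.MW ~ Data.Date, data = toy_demand, FUN = mean)

# Join the two series on the calendar date.
merged <- merge(daily_demand, toy_ozone, by.x = "Data.Date", by.y = "Date")

# Pearson correlation between daily mean demand and daily maximum ozone.
demand_ozone_cor <- cor(merged$Demand.MW, merged$OZONE_8HR_DAILY_MAX)
demand_ozone_cor
```

On the real data, `cor.test()` could be used in place of `cor()` to also obtain a confidence interval and p-value for each authority and site pair.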
This short investigation suggests that there may be a slight correlation between the two datasets, which could warrant further investigation. However, the correlation is not large enough to say definitively that power demand and ozone concentration in the atmosphere of the United States are correlated, or that one is a result of the other. Many factors besides ozone concentration drive increases or decreases in power demand, and ozone concentration is likely not a factor at all. Instead, further investigation into the age of the infrastructure within the authorities should be conducted to assess the efficiency of the power facilities that currently exist, and to see whether power demand can be met, maintained, and decreased with more efficient and cheaper hardware.